[SYCL] Add RT dependency on interface layer for offloading #2

KseniyaTikhomirova · 2025-06-05T13:53:38Z

This is part of the SYCL support upstreaming effort. The relevant RFCs can
be found here:

https://discourse.llvm.org/t/rfc-add-full-support-for-the-sycl-programming-model/74080
https://discourse.llvm.org/t/rfc-sycl-runtime-upstreaming/74479

The SYCL runtime is device-agnostic and uses Unified Runtime (UR; GitHub:
oneapi-src/unified-runtime) as an external dependency. UR serves as an
interface layer between the SYCL runtime and device-specific backends, and
has several adapters that bind to the various backends.

NOTE: UR is considered a temporary solution until llvm-project/offload is
fully functional and able to replace UR.

This commit adds:

- fetching UR,
- building UR as a dependency,
- a document with a short overview of UR, with links to repos and documentation.
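
For readers unfamiliar with how an external project is usually pulled into a CMake build, here is a minimal sketch of the fetch-and-build step using CMake's standard `FetchContent` module. It is an illustration only and not the contents of `FetchUnifiedRuntime.cmake` from this PR: the content name `unified-runtime` is chosen for the example, and the `GIT_TAG` value is a placeholder assumption.

```cmake
# Illustrative sketch only; NOT the FetchUnifiedRuntime.cmake added by this PR.
include(FetchContent)

# Repository taken from the PR description. GIT_TAG is a placeholder
# (assumption): a real build should pin a specific commit or release tag.
FetchContent_Declare(unified-runtime
  GIT_REPOSITORY https://github.com/oneapi-src/unified-runtime.git
  GIT_TAG        main
)

# Downloads the sources if they are not present yet and adds them to the
# current build, so UR is configured and built as part of the same build tree.
FetchContent_MakeAvailable(unified-runtime)
```

Pinning the tag keeps builds reproducible; the actual module in this PR may configure additional options (for example, which adapters to build) that are not shown here.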

KseniyaTikhomirova · 2025-06-05T13:54:17Z

For reviewers: an example of the install structure for the L0 adapter:

.../ssfork_llvm$ ninja -C $build_llvm install | grep "level_zero"
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/./include/level_zero/ze_api.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/./include/level_zero/ze_ddi.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/./include/level_zero/ze_ddi_common.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/./include/level_zero/zes_api.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/./include/level_zero/zes_ddi.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/./include/level_zero/zet_api.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/./include/level_zero/zet_ddi.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/./include/level_zero/layers/zel_tracing_api.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/./include/level_zero/layers/zel_tracing_ddi.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/./include/level_zero/layers/zel_tracing_register_cb.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/./include/level_zero/loader/ze_loader.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/share/doc/Runtimes/examples/common/examples_level_zero_helpers.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/share/doc/Runtimes/examples/common/examples_level_zero_helpers.c
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/share/doc/Runtimes/examples/level_zero_shared_memory
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/share/doc/Runtimes/examples/level_zero_shared_memory/level_zero_shared_memory.c
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/share/doc/Runtimes/examples/level_zero_shared_memory/CMakeLists.txt
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/share/doc/Runtimes/examples/ipc_level_zero
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/share/doc/Runtimes/examples/ipc_level_zero/ipc_level_zero.c
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/share/doc/Runtimes/examples/ipc_level_zero/CMakeLists.txt
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/include/umf/providers/provider_level_zero.h
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/lib/libur_adapter_level_zero.so.0.12.0
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/lib/libur_adapter_level_zero.so.0
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/lib/libur_adapter_level_zero.so
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/lib/libur_adapter_level_zero.so.0.12.0
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/lib/libur_adapter_level_zero.so.0
-- Up-to-date: /localdisk2/ktikhomi/repo/ssfork_llvm/install/release/lib/libur_adapter_level_zero.so

KseniyaTikhomirova · 2025-06-05T13:59:27Z

@tahonermann, @dvrogozh, @asudarsa, @aelovikov-intel
I consider this PR the next step after the very first PR with the libsycl project structure: #1. That PR has not been published upstream yet, but we did agree on its content.

It would be nice to start reviewing this step early.

KseniyaTikhomirova · 2025-06-16T10:17:25Z

kindly ping: @tahonermann, @dvrogozh, @asudarsa, @aelovikov-intel
For some reason I can't add you to this PR as reviewers

dvrogozh · 2025-06-17T18:10:33Z

libsycl/cmake/Modules/FetchUnifiedRuntime.cmake

+    set(UMF_LINK_HWLOC_STATICALLY ON CACHE INTERNAL "static HWLOC")
+  endif()
+
+  fetch_adapter_source(level_zero


You note in the PR description that "UR is considered a temporary solution until llvm-project/offload is fully functional and able to replace UR". I am afraid that if it is merged, it will very well become the actual solution, which will be quite hard to remove. For example, existing UR depends on 4 adapters - are you sure that the code for all adapters will be accepted into the llvm-project/offload project easily, or at all? I do not believe that such a temporary solution is the right approach. Instead, it's better to focus on llvm-project/offload directly, limit the scope of initial support (Intel GPUs), and go from there.

@sergey-semenov could you please help answer this, since this approach was discussed before I joined the upstreaming activity?

AFAIK, the presence of UR in upstream was discussed and was not really welcomed by the community, although folks did agree to start with UR.

Good point @dvrogozh
I see the email discussions and RFC discussions about this issue. But I am not able to find any communication on what we agreed on.

I don't think we've had any pushback on UR as a short-term dependency to unblock rt upstreaming in the RFCs (beyond being asked to run this by the LLVM board, which we have). I believe the current plan is to bring liboffload to functional parity with UR this year, which is when we're going to switch to it in both intel/llvm and upstream. @RaviNarayanaswamy @alycm please correct me if I'm wrong on any of this.

@sergey-semenov is correct. liboffload is being worked on; currently most of the contribution is done by CodePlay. For the short term, there was no objection from the community to using UR for offloading.

Codeplay, unless CodeSourcery are helping too! :-)

But yes, we're working on liboffload. Using liboffload is the long-term goal, but it is not yet mature enough to fully support SYCL-RT.

There is a liboffload adapter in Unified Runtime, so you can run SYCL-RT --> Unified Runtime --> liboffload. We're using this to drive development and for testing. But most SYCL features don't work yet.

asudarsa · 2025-06-17T20:24:40Z

kindly ping: @tahonermann, @dvrogozh, @asudarsa, @aelovikov-intel For some reason I can't add you to this PR as reviewers

Hi @KseniyaTikhomirova, thanks for the ping. I will look at this today.

asudarsa · 2025-06-18T15:34:19Z

libsycl/docs/DesignDocs/UnifiedRuntime.rst

@@ -0,0 +1,26 @@
+=====================


Nit: Any reason why this is not directly under docs?

Thanks

UR is an implementation detail (design) of the SYCL RT. I believe that directly under docs we should keep user-visible things like guides, FAQ, release notes, and so on.
libcxx also splits its documents this way: https://github.com/llvm/llvm-project/tree/main/libcxx/docs
The split in intel/llvm is also very similar: https://github.com/intel/llvm/blob/sycl/sycl/doc/design/UnifiedRuntime.md

KseniyaTikhomirova · 2025-06-23T15:33:08Z

@tahonermann, @dvrogozh, @asudarsa, @aelovikov-intel, @sergey-semenov
I believe the question about UR presence is answered. Kindly ping to review & approve if you have no objections.

asudarsa · 2025-06-23T23:48:42Z

libsycl/docs/DesignDocs/UnifiedRuntime.rst

+
+.. _unified runtime:
+
+Overview


We can avoid a sub-section (Overview) here. We can add this if we add more details to this document.

Thanks

removed in 7fb24ad

asudarsa · 2025-06-23T23:48:55Z

libsycl/docs/DesignDocs/UnifiedRuntime.rst

+Overview
+========
+
+The Unified Runtime project serves as an interface layer between the SYCL


Suggested change

The Unified Runtime project serves as an interface layer between the SYCL

The Unified Runtime (UR) project serves as an interface layer between the SYCL

updated in 7fb24ad

asudarsa

Document changes look good. Couple of nits.

Thanks


KseniyaTikhomirova · 2025-07-23T16:20:45Z

Hi @tahonermann, @dvrogozh, @asudarsa, @aelovikov-intel, @sergey-semenov, I created another PR for these changes, since this initial PR was created against an old branch used for internal review and differs too much from the branch published to the community.
The new PR I'd like to ask you to review is #4.


KseniyaTikhomirova requested a review from sergey-semenov June 5, 2025 13:55

dvrogozh suggested changes Jun 17, 2025

asudarsa reviewed Jun 18, 2025

KseniyaTikhomirova requested review from asudarsa and dvrogozh June 20, 2025 09:42

asudarsa reviewed Jun 23, 2025

asudarsa approved these changes Jun 23, 2025

AmrDeveloper and others added 15 commits July 2, 2025 19:25

[CIR] Implement NotEqualOp for ComplexType (llvm#146129)

e9be528

This change adds support for the not equal operation for ComplexType llvm#141365

[SHT_LLVM_BB_ADDR_MAP] Remove support for versions 1 and 0 (SHT_LLVM_BB_ADDR_MAP_V0). (llvm#146186)

6b623a6

Version 2 was added more than two years ago (llvm@6015a04). So it should be safe to deprecate older versions.

[X86] Add BLEND/UNPCK shuffles to canCreateUndefOrPoisonForTargetNode/isGuaranteedNotToBeUndefOrPoisonForTargetNode (llvm#146728)

aa8e1bc

None of these implicitly generate UNDEF/POISON

[Sema] Remove an unnecessary cast (NFC) (llvm#146703)

5e6e51b

The only use of Receiver is to initialize RecExpr. This patch renames Receiver to RecExpr while removing the cast statement.

[bazel] Add missing dep after 242996e

dfc5987

Fix wcpncpy() return value; add test.

988876c

Revert "Fix wcpncpy() return value; add test." (llvm#146752)

77d9591

This reverts commit 988876c. Was intended to be a PR

[lldb] remove do-nothing defaults in case statements,

00e071d

unbreak gcc CI bots.

asan: refactor new/delete interceptor macros (llvm#146696)

e3edc1b

Refactors new/delete interceptor macros per the discussion in llvm#145087. Signed-off-by: Justin King <[email protected]>

DavidSpickett and others added 21 commits July 4, 2025 09:02

[bazel] Port 0ceb0c3

c7d3b81

[libc++][NFC] Fixed some wrongly spelled _LIBCPP_STD_VER in comments (llvm#147008)

a774463

[libc++] Fix tests broken on the Buildkite CI (llvm#146733)

8f6a964

The Buildkite CI was unintentionally disabled for a few weeks. This patch fixes the CI jobs now that it has been re-enabled.

[libclc] Fix target dependency

222e795

The prepare target was depending on the output of a custom command, but wasn't the full path to that file. This tripped up CMake if the file was removed as it didn't know how to rebuild that file.

remove version from win library name

554decd

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Merge branch 'main' into reviewed_addlibsycl

e8e2ca9

rest of the changes for version removal

b824da0

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

remove extra code for implib name

81c9efd

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

fix comments

e502190

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

Add UR dependency

cbd4127

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

build only sycl lib

b047308

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

remove extra cmake handling for ur

687ea05

Signed-off-by: Tikhomirova, Kseniya <[email protected]>

KseniyaTikhomirova force-pushed the 1_add_ur branch from 8b552fa to 687ea05 Compare July 23, 2025 16:14

KseniyaTikhomirova closed this Jul 23, 2025

dvrogozh mentioned this pull request Jul 23, 2025

[SYCL] Add RT dependency on interface layer for offloading #4

Closed

